Abstract — This paper demonstrates how to apply prognostics to 
power MOSFETs (metal-oxide-semiconductor field-effect transistors). The 
methodology uses thermal cycling to age devices and Gaussian 
process regression to perform prognostics. The approach is 
validated with experiments on 100V power MOSFETs. The 
failure mechanism for the stress conditions is determined to be 
die-attachment degradation. Change in ON-state resistance is 
used as a precursor of failure due to its dependence on junction 
temperature. The experimental data is augmented with a finite 
element analysis simulation that is based on a two-transistor 
model. The simulation assists in the interpretation of the 
degradation phenomena and SOA (safe operation area) change. 

I. Introduction 

Prognostics is an engineering discipline focused on 
predicting the time at which an in-service component will fail 
or no longer perform its intended function. Predictions are 
made in-situ on individual in-service components. This is in 
contrast to statistical reliability methods that produce mostly a 
priori life estimates. The science of prognostics is based on the 
analysis of failure modes, detection of early signs of wear and 
aging, and fault conditions. These signs are then correlated 
with a damage propagation model and suitable prediction 
algorithms to arrive at a remaining useful life (RUL) estimate. 
The discipline that links studies of failure mechanisms to 
system lifecycle management is often referred to as 
prognostics and health management (PHM). PHM techniques 
have recently enjoyed considerable attention, for example, in 
the aerospace domain where the assessment of in-situ health 
of components and subsystems enables safe operations. 
Although the emphasis in PHM has so far been on mechanical 
components, the ability to perform health assessment of 
electronic components becomes essential as more safety- 
critical functionality is assumed by electronics. To that end, an 
in-depth understanding of aging mechanisms and their 
manifestation is vital. The work reported here contributes to 
this undertaking. 


In this paper a prognostics technique is presented for a 
power MOSFET based on an accelerated aging methodology. 
The methodology utilizes thermal and power cycling and was 
validated with tests using 100V power MOSFET devices. The 
major failure mechanism for the stress conditions is die-attachment 
degradation, typical for discrete devices with lead-free solder die 
attachment. ON-state resistance changes with junction temperature and 
can therefore be used as a precursor of failure for the die-attach 
failure mechanism under these stress conditions. This degradation 
process exhibits characteristics to which data-driven prognostics 
algorithms can be applied. The experimental data is supported by a finite 
element analysis simulation. The numerical simulation 
assumes a two-transistor model. Results are used to interpret 
the phenomena of device degradation and SOA change. A 
Gaussian process regression framework is used for prediction 
of time to failure. The features used in the algorithm are based 
on normalized ON-resistance computed from in-situ 
measurements of the electro-thermal response. Results are 
presented from experiments on power MOSFET IRF520Npbf 
in a TO-220 package. The choice of the particular component 
is mainly due to its common use in switched mode power 
supplies in aerospace systems like radars and navigation 
equipment. 

A. Related work 

In [1] a model-based prognostics approach for discrete 
IGBTs was presented. RUL prediction was accomplished 
using a particle filter algorithm where the collector-emitter 
leakage current was used as the primary precursor of 
failure. A prognostics approach for power MOSFETs was 
presented in [2]. There, the threshold voltage was used as a 
precursor of failure; a particle filter was used in conjunction 
with an empirical degradation model. The latter was based on 
accelerated life test data. 

Identification of parameters that indicate precursors to 
failure for discrete power MOSFETs and IGBTs has received 
considerable attention in recent years. Several studies have 
focused on precursor-of-failure parameters for discrete IGBTs 
under thermal degradation due to power cycling overstress. In 
[3], collector-emitter voltage was identified as a health 
indicator; in [4], the maximum peak of the collector-emitter 
ringing at the turn of the transient was identified as the 
degradation variable; in [5] the switching turn-off time was 
recognized as a failure precursor; and switching ringing was 
used in [6] to characterize degradation. For discrete power 
MOSFETs, on-resistance was identified as a precursor of 
failure for the die-solder degradation failure mechanism 
[7] [8]. A shift in threshold voltage was identified as a failure 
precursor for the gate structure degradation fault mode [2] [9]. 

There have been some efforts in the development of 
degradation models that are a function of the usage/aging time 
based on accelerated life tests. For example, empirical 
degradation models for model-based prognostics are presented 
in [1] and [2] for discrete IGBTs and power MOSFETs, 
respectively. Gate structure degradation modeling of discrete 
power MOSFETs under ion impurities was presented in [10]. 

II. Accelerated Aging Experiments 

Accelerated aging approaches provide a number of 
opportunities for the development of physics-based 
prognostics models for electronics components and systems. 
In particular, they allow for the assessment of reliability in a 
considerably shorter amount of time than running long-term 
reliability tests. The development of prognostics algorithms 
faces some of the same constraints as reliability engineering in 
that both need information about failure events of critical 
electronics systems. These data are rarely ever available. In 
addition, prognostics requires information about the 
degradation process leading to an irreversible failure; 
therefore, it is necessary to record in-situ measurements of key 
output variables and observable parameters in the accelerated 
aging process in order to develop and learn failure progression 
models. 

Thermal cycling overstress leads to thermo-mechanical 
stresses in electronics due to mismatch of the coefficient of 
thermal expansion between different elements in the 
component’s packaged structure. The accelerated aging 
applied to the devices presented in this work consists of 
thermal overstress. Latch-up, thermal runaway, or failure to 
turn ON due to loss of gate control are considered as the 
failure conditions. Thermal cycles were induced by power 
cycling the devices without the use of an external heat sink. 
The device case temperature was measured and directly used 
as control variable for the thermal cycling application. For 
power cycling, the applied gate voltage was a square wave 
signal with an amplitude of -15 V, a frequency of 1 kHz and a 
duty cycle of 40%. The drain-source was biased at 4 Vdc and a 
resistive load of 0.2 Ω was used on the collector side output of 
the device. The aging system used for these experiments is 
described in detail in [4]. The accelerated aging methodology 
used for these experiments is presented in detail in [8]. 

Figure 1 shows an X-ray image of the device after 
degradation. It can be observed that the die solder has 
migrated and that voids have formed. This confirms that the 
thermal resistance from junction to case has increased during 
the stress time resulting in increase of the junction temperature 
and ON-resistance. Figure 2 presents a plot of the measured 
Rds(on) as a function of case temperature for several 
consecutive aging tests on the same device. For each test run, 
the temperature of the device is increased from room 
temperature to a high temperature setting thus providing the 
opportunity to characterize Rds(on) as a function of time at 
different degradation stages. It can be observed how this curve 
shifts as a function of aging time, which is indicative of an 
increased junction temperature due to poor heat dissipation 
and hence degraded die-attach. 

Figure 1. X-ray of degraded device. 

Figure 2. Rds(on) degradation process due to die-attach damage. 

III. Modeling 
A. Mixed-mode simulation 

Numerical analysis was performed using a finite element 
model (FEM) representation of the device under consideration 
(figure 3). This numerical analysis provided I-V 
characteristics at different values of gate bias Vgs for a device 
with generic simulation parameters approximating those of the tested 
devices. 

The electrical response was obtained with a mixed-mode 
circuit-device simulation using the DECIMM™ software from 
Angstrom Designs Automation [11] [12]. The mixed-mode 
circuit presented in figure 4 is simulated in conjunction with 
the FEM of the MOSFETs. This was implemented both with a 
single transistor and with two transistors as shown in figure 4. 
Results for both models are discussed below. A voltage- 
controlled voltage source circuit was used to auto bias the gate 
voltage. This prevents the device from running outside the 
SOA. 

Figure 3. Finite element model vertical DMOS device cross-section. 



Figure 4. Two-transistor mixed-mode simulation circuit. 




Figure 5. Single transistor with auto bias reference (uV): a) SOA, b) 
extracted maximum temperature. 

The electro-thermal SOA for a single transistor mixed- 
mode simulation with auto bias control of the gate voltage is 
presented in figure 5 for two conditions: a) slow transient pulse 
with constant gate bias; b) slow transient pulse with auto bias 
circuit. The observed instability points represent the critical 
voltages and currents limiting the safe operation area of the 
electrical regime. 


B. Two transistor degradation model 

The two-transistor model physically represents the device 
with partial area die-attachment degradation. The first 
transistor has the original default parameters, including the thermal 
resistance R_T1 and an area factor of 90%, while the second transistor 
represents degradation due to electro-thermal stress: it covers 
10% of the area and its thermal resistance is scaled by a 
coefficient K (figure 4). As can be seen from the simulation 
results in figure 6, even a small deviation in the thermal 
resistance of the second transistor (R_T2 = K × R_T1) results in a 
significant reduction of the critical voltage in auto bias 
conditions. 

Figure 6. Results of numerical analysis for different thermal resistance 
parameters of the 10% second transistor model region at 450 K heat sink, 
R_T2 = K × R_T1. 


IV. Prediction of Remaining Useful Life 

Gaussian Process Regression (GPR) is a data-driven 
technique that can be used to estimate future fault degradation 
based on training data collected from measurements. First, 
a prior distribution, which may be derived from domain 
knowledge, is assumed for the underlying process function [13]. 
This prior is then tuned to fit the available measurements and 
used as the probabilistic function for regression over the 
training points [14]. The output is a mean function to describe 
the behavior and a covariance function to describe the 
uncertainty. These functions can then be used to predict a 
mean value and corresponding variance for a given future 
point of interest. The behavior of a dynamic process is 
captured in the covariance function chosen for the Gaussian 
process. The covariance structure also incorporates prior 
beliefs of the underlying system noise. A covariance function 
consists of various hyper-parameters that define its properties. 
Proper tuning of these hyper-parameters is key to good 
performance. While a user typically needs to specify the type 
of covariance function, the corresponding hyper-parameters 
can be learned from training data using gradient-based (or 
other) optimization, for example by maximizing the 
marginal likelihood of the observed data with respect to the 
hyper-parameters [14]. 
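
The regression step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the squared-exponential covariance function, the zero-mean prior, and all function names (sq_exp, gp_fit, gp_predict, log_marginal_likelihood) are assumptions made for illustration, and the hyper-parameters are passed in by hand rather than optimized. The Cholesky factorization is the step that gives GPR its cubic cost in the number of training points.

```python
import math

def sq_exp(a, b, length_scale, signal_var):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular L with A = L L^T; A must be symmetric positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_upper_t(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_fit(x_train, y_train, length_scale, signal_var, noise_var):
    """Factor the noisy training covariance and precompute alpha = K^-1 y."""
    n = len(x_train)
    K = [[sq_exp(x_train[i], x_train[j], length_scale, signal_var)
          + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y_train))
    return L, alpha

def gp_predict(x_train, L, alpha, x_star, length_scale, signal_var):
    """Posterior mean and variance of the latent function at one future point."""
    k_star = [sq_exp(x, x_star, length_scale, signal_var) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = signal_var - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)  # clip tiny negative values caused by rounding

def log_marginal_likelihood(y_train, L, alpha):
    """Objective that hyper-parameter tuning would maximize."""
    n = len(y_train)
    data_fit = -0.5 * sum(y * a for y, a in zip(y_train, alpha))
    complexity = -sum(math.log(L[i][i]) for i in range(n))
    return data_fit + complexity - 0.5 * n * math.log(2.0 * math.pi)
```

In such a sketch, hyper-parameter learning would amount to calling gp_fit for candidate values of length_scale, signal_var and noise_var and keeping the combination with the largest log_marginal_likelihood. Because the prior mean is zero here, predictions far from the training data revert to zero, so the measured signal would typically be offset or detrended first.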

In the application here, the ON-resistance was computed 
as the ratio of the voltage to the current between the drain and 
source terminals of the device. By estimating the relationship 
between operational temperature and ON-resistance of the 
device, the computed ON-resistance was normalized to 
eliminate temperature effects. The signal was filtered by 
computing the mean of every one-minute-long window. Since 
the complexity of GPR is O(n^3) in the number of training points n, 
computational effort grows rapidly with the number of data points, 
so it is important to keep the number of training points low. Therefore, a 
uniform sampling of the curve was carried out to select the 
desired number of training points to train the GPR and make 
predictions. This process was repeated 35 times and the 
results were aggregated to produce the final prediction. As shown 
in figure 7, predictions were made at four (somewhat 
arbitrarily chosen) time instances: 160, 180, 200, and 220 
minutes into aging. Subtracting the time when the prediction 
was made from the time when the dashed line crosses the 
failure threshold gives the estimated remaining component 
life. As more data becomes available, the predictions become 
more accurate (as indicated by the proximity of the predicted 
dashed lines to the crossing of the failure threshold by the 
ON-resistance) and more precise (the uncertainty cones are 
narrower for later predictions). 

Figure 7. Prediction of RUL for aged device using Gaussian process 
regression technique. 
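
The feature preparation and the RUL computation described in this section can be sketched as follows. This is an illustration under stated assumptions, not the processing chain used in the experiments: the text does not specify how the temperature relationship was estimated, so a straight-line fit on healthy-device data is assumed here, and all function names are hypothetical.

```python
def on_resistance(v_ds, i_d):
    """ON-resistance as drain-source voltage divided by drain current."""
    return [v / i for v, i in zip(v_ds, i_d)]

def fit_line(x, y):
    """Ordinary least-squares line y = a + b * x (assumed temperature model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def normalize(r_on, temp, a, b):
    """Divide each resistance by the healthy value expected at its temperature."""
    return [r / (a + b * t) for r, t in zip(r_on, temp)]

def window_means(t_min, values, window_min=1.0):
    """Average the samples that fall in each consecutive time window."""
    bins = {}
    for t, v in zip(t_min, values):
        bins.setdefault(int(t // window_min), []).append(v)
    keys = sorted(bins)
    centers = [(k + 0.5) * window_min for k in keys]
    means = [sum(bins[k]) / len(bins[k]) for k in keys]
    return centers, means

def uniform_subsample(xs, ys, n_points):
    """Pick n_points (at least 2) evenly spaced by index, keeping both ends."""
    if n_points >= len(xs):
        return list(xs), list(ys)
    step = (len(xs) - 1) / (n_points - 1)
    idx = [round(i * step) for i in range(n_points)]
    return [xs[i] for i in idx], [ys[i] for i in idx]

def remaining_useful_life(t_future, predicted, threshold, t_prediction):
    """Time from t_prediction until the predicted curve first reaches threshold."""
    for t, r in zip(t_future, predicted):
        if r >= threshold:
            return t - t_prediction
    return None  # no crossing inside the prediction horizon
```

A typical call sequence would compute on_resistance from the measured voltage and current, normalize it with coefficients obtained from fit_line on early-life data, smooth it with window_means, reduce it with uniform_subsample, and pass the regression output to remaining_useful_life together with the failure threshold.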

V. Discussion 

The prognostics technique presented here is preliminary 
work that serves as a case study on the prediction of remaining 
life of power MOSFETs. There are several strong assumptions 
that need to be challenged in order to make the proposed 
process practical. For instance, the future operational 
conditions and loading of the device are considered constant at 
the same magnitudes as the loads and conditions used during 
accelerated aging. In addition, the algorithm development is 
conducted using accelerated life test data. In a real-world 
implementation, the degradation process of the device would 
occur over a considerably longer time scale. This is a topic of 
future work. 

The proposed two-transistor model is shown to be a good 
candidate for a degradation model for model-based 
prognostics. The model parameters K and W1 could be varied 
as the device degrades as a function of usage time, loading and 
environmental conditions. Parameter W1 defines the area of 
the healthy transistor: the smaller this area, the larger the 
degradation in the two-transistor model. In addition, parameter 
K serves as a scaling factor for the thermal resistance of the 
degraded transistor: the larger this factor, the larger the 
degradation in the model. 